Boosting Algorithms as Gradient Descent in Function Space
Authors
Abstract
Much recent attention, both experimental and theoretical, has been focused on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the impressive generalization performance of algorithms like AdaBoost can be attributed to the classifier having large margins on the training data. We present abstract algorithms for finding linear and convex combinations of functions that minimize arbitrary cost functionals (i.e., functionals that do not necessarily depend on the margin). Many existing voting methods can be shown to be special cases of these abstract algorithms. Then, following previous theoretical results bounding the generalization performance of convex combinations of classifiers in terms of general cost functions of the margin, we present a new algorithm (DOOM II) for performing a gradient descent optimization of such cost functions. Experiments on several data sets from the UC Irvine repository demonstrate that DOOM II generally outperforms AdaBoost, especially in high noise situations. Margin distribution plots verify that DOOM II is willing to 'give up' on examples that are too hard in order to avoid overfitting. We also show that the overfitting behavior exhibited by AdaBoost can be quantified in terms of our proposed cost function.
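The view of boosting as gradient descent in function space can be made concrete with a short sketch. The code below is a minimal illustration, not the authors' implementation: the sigmoidal margin cost C(z) = 1 - tanh(λz), the decision-stump base learner, and the fixed step size are illustrative assumptions standing in for the cost functionals, weak learners, and line search discussed in the paper.

```python
import numpy as np

def margin_cost_grad(margins, lam=4.0):
    # Derivative of a sigmoidal margin cost C(z) = 1 - tanh(lam * z)
    # with respect to the margin z (a stand-in for a DOOM II-style cost).
    return -lam * (1.0 - np.tanh(lam * margins) ** 2)

def fit_stump(X, y, w):
    # Weighted decision stump: pick the (feature, threshold, sign)
    # with the lowest weighted error.  Labels y are +/-1, weights w >= 0.
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, j, thr, sign)
    _, j, thr, sign = best
    return lambda Z: sign * np.where(Z[:, j] <= thr, 1, -1)

def functional_gradient_boost(X, y, rounds=50, step=0.1):
    # Gradient descent in function space: each round the base learner is
    # fit in the direction of steepest descent of the total margin cost.
    F = np.zeros(len(y))              # current voted combination F(x_i)
    ensemble = []
    for _ in range(rounds):
        # Example weights are |dC/dF(x_i)|; the sign of the negative
        # gradient tells the weak learner which label to chase.
        g = y * margin_cost_grad(y * F)
        w = np.abs(g) / np.abs(g).sum()
        h = fit_stump(X, np.sign(-g), w)
        ensemble.append((step, h))
        F += step * h(X)              # fixed step; a line search is also possible
    return ensemble

def predict(ensemble, X):
    return np.sign(sum(a * h(X) for a, h in ensemble))
```

Each round reweights the training examples by the magnitude of the cost derivative at their current margins, so examples with very negative margins eventually receive little weight; this is the 'giving up' behavior referred to in the abstract.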
Similar resources
On Early Stopping in Gradient Descent Learning
In this paper, we study a family of gradient descent algorithms to approximate the regression function from Reproducing Kernel Hilbert Spaces (RKHSs), the family being characterized by a polynomially decreasing rate of step sizes (or learning rate). By solving a bias-variance trade-off we obtain an early stopping rule and some probabilistic upper bounds for the convergence of the algorithms. Thes...
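The stopping rule in that abstract is derived theoretically from a bias-variance trade-off. As a rough practical stand-in (a substitution, not the cited paper's rule), stopping gradient descent once a holdout error stops improving captures the same idea:

```python
import numpy as np

def gd_with_early_stopping(X, y, X_val, y_val, lr=0.01, max_iters=5000, patience=20):
    # Plain gradient descent on squared error, stopped when the holdout
    # error has not improved for `patience` consecutive iterations.
    w = np.zeros(X.shape[1])
    best_w, best_err, stall = w.copy(), np.inf, 0
    for _ in range(max_iters):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
        val_err = np.mean((X_val @ w - y_val) ** 2)
        if val_err < best_err:
            best_w, best_err, stall = w.copy(), val_err, 0
        else:
            stall += 1
            if stall >= patience:
                break          # further iterations would mostly fit noise
    return best_w
```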
Boosting Density Estimation
Several authors have suggested viewing boosting as a gradient descent search for a good fit in function space. We apply gradient-based boosting methodology to the unsupervised learning problem of density estimation. We show convergence properties of the algorithm and prove that a strength of weak learnability property applies to this problem as well. We illustrate the potential of this approach...
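As a hedged illustration of the idea in that abstract (not the cited algorithm itself), a mixture density can be grown by functional gradient steps: each round fits a weighted Gaussian where the current density assigns low probability to the data, then mixes it in with a line-searched coefficient. The uniform initialisation, Gaussian components, and grid line search are assumptions made for the sketch.

```python
import numpy as np

def gauss_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def boost_density(x, rounds=10):
    # Functional-gradient boosting for 1-D density estimation: f is a
    # growing mixture, and each round fits a Gaussian to the data weighted
    # by 1/f (large weight where the current density under-covers the data).
    f = np.full(len(x), 1.0 / (x.max() - x.min()))  # crude uniform start
    components = []
    for _ in range(rounds):
        w = 1.0 / f
        w /= w.sum()
        mu = np.sum(w * x)
        var = np.sum(w * (x - mu) ** 2) + 1e-6
        p = gauss_pdf(x, mu, var)
        # Line search over the mixing weight alpha in (0, 1].
        alphas = np.linspace(0.01, 1.0, 100)
        ll = [np.sum(np.log((1 - a) * f + a * p)) for a in alphas]
        a = alphas[int(np.argmax(ll))]
        f = (1 - a) * f + a * p
        components.append((a, mu, var))
    # Stagewise updates: the effective mixture weight of component t is
    # alpha_t * prod_{s > t} (1 - alpha_s).
    return components
```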
Gradient Tree Boosting for Training Conditional Random Fields
Conditional random fields (CRFs) provide a flexible and powerful model for sequence labeling problems. However, existing learning algorithms are slow, particularly in problems with large numbers of potential input features and feature combinations. This paper describes a new algorithm for training CRFs via gradient tree boosting. In tree boosting, the CRF potential functions are represented as ...
A Robust Boosting Method for Mislabeled Data
We propose a new, robust boosting method by using a sigmoidal function as a loss function. In deriving the method, the stagewise additive modelling methodology is blended with the gradient descent algorithms. Based on intensive numerical experiments, we show that the proposed method is actually better than AdaBoost and other regularized methods in test error rates in the case of noisy, ...
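The robustness argument above comes down to how fast each cost's derivative grows as a margin becomes more negative, since that derivative sets an example's weight in the next round. A tiny comparison (the cost shapes are generic illustrations, not the cited paper's exact losses):

```python
import numpy as np

# An example's influence on the next round is proportional to |dC/dz|
# at its margin z.  Compare AdaBoost's exponential cost with a sigmoidal
# cost (scale chosen arbitrarily for illustration).
z = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])   # margins y * F(x)
exp_weight = np.exp(-z)                      # |d/dz exp(-z)|
sig_weight = 1.0 - np.tanh(z) ** 2           # |d/dz (1 - tanh(z))|
print(exp_weight)   # grows without bound as the margin becomes more negative
print(sig_weight)   # bounded, so badly misclassified (possibly mislabeled)
                    # examples cannot dominate the reweighting
```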
Totally Corrective Multiclass Boosting with Binary Weak Learners
In this work, we propose a new optimization framework for multiclass boosting learning. In the literature, AdaBoost.MO and AdaBoost.ECC are two successful multiclass boosting algorithms that can use binary weak learners. We explicitly derive these two algorithms' Lagrange dual problems based on their regularized loss functions. We show that the Lagrange dual formulations enable us to desi...
Publication date: 1999